gpu process pegged with 97% usage loading PremierOpinion's pmls64.dll
Categories
(Core :: Graphics, defect)
Tracking
()
Tracking | Status | |
---|---|---|
firefox113 | --- | fixed |
People
(Reporter: chutten, Assigned: yannis)
References
Details
(Whiteboard: [win:stability])
Attachments
(1 file)
User reported severe lag in Firefox. Instructed them to check about:processes
, used it to track it to the GPU process, and used the built in "profile all threads for 5s" button (kudos to all teams involved in making this so very straightforward to do) to gather https://share.firefox.dev/42lFKgI which shows a whole lotta DllLoad
for pmls64.dll
which is connected to "PremierOpinion".
Anything we can do about this? The user's taking actions to scrub the unwanted third-party software... but could we have prevented it from causing such grief in Firefox?
Updated•2 years ago
|
Comment hidden (obsolete) |
Comment 2•2 years ago
|
||
Am I reading the profile wrong? It looks like the DllLoad
calls are taking on the order of 250us, while the NotifyObservers
calls are taking on the order of 1us. So while I agree that we could probably do something more sophisticated about notifying, it doesn't seem like that would help very much in this case.
We have seen issues like this before when we add a DLL that gets loaded by a window hook to the DLL blocklist, and our advice is...don't block those DLLs, which isn't helpful in this case since we're not doing the blocking.
Comment 3•2 years ago
|
||
I may be misreading this as well, not a profiler expert, but it also looks to me like most of the time we're in NotifyObservers is being spent on recording data for the profile itself, code that would be skipped when a profile isn't being collected. So I do think the observer notifications might be a red herring here.
Updated•2 years ago
|
Assignee | ||
Comment 4•2 years ago
•
|
||
Thanks Greg and Molly for your comments, they were helpful. Indeed my first comment was incorrect.
I managed to reproduce exactly the profile shared in the description, as well as the severe lag. It happens when a DLL that is registered with SetWindowsHookExW
for WH_GETMESSAGE
gets blocked by our blocklist. I managed to reproduce the profile by registering my own DLL and blocking it in about:third-party
, then I realized that the story is a little bit different for pmls64.dll
...
(In reply to Greg Stoll from comment #2)
We have seen issues like this before when we add a DLL that gets loaded by a window hook to the DLL blocklist, and our advice is...don't block those DLLs, which isn't helpful in this case since we're not doing the blocking.
... In fact we are doing the blocking! This DLL is hard-blocked in our static blocklist.
First let me clarify what we see in the profile now that I understand it better. For a start, the raw addresses ending in 74 and b4 without symbols correspond to the following stacks:
# Child-SP RetAddr Call Site
00 0000001f`e6ffdb78 00007ff6`2bdb03d6 0x00000216`7a8c2c74 (orig_OpenFile)
01 0000001f`e6ffdb80 00007ff6`2bdb5125 firefox!TargetNtOpenFile+0x56 [/builds/worker/checkouts/gecko/security/sandbox/chromium/sandbox/win/src/filesystem_interception.cc @ 125]
02 0000001f`e6ffdcf0 00007ff9`3c46064c firefox!TargetNtOpenFile64+0x35 [/builds/worker/checkouts/gecko/security/sandbox/chromium/sandbox/win/src/interceptors_64.cc @ 120]
03 0000001f`e6ffdd30 00007ff9`3c4610e0 ntdll!LdrpMapDllNtFileName+0xe8
04 0000001f`e6ffde30 00007ff9`3c460eaf ntdll!LdrpMapDllFullPath+0xe0
05 0000001f`e6ffdfc0 00007ff9`3c4788c4 ntdll!LdrpProcessWork+0x77
06 0000001f`e6ffe010 00007ff9`3c468c9c ntdll!LdrpLoadDllInternal+0x1a0
07 0000001f`e6ffe0b0 00007ff9`3c47a24a ntdll!LdrpLoadDll+0xb0
08 0000001f`e6ffe270 00007ff6`2bd98803 ntdll!LdrLoadDll+0xfa
09 (Inline Function) --------`-------- firefox!mozilla::interceptor::FuncHookCrossProcess<mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyUnique<mozilla::interceptor::MMPolicyOutOfProcess> >,long (*)(wchar_t *, unsigned long *, _UNICODE_STRING *, void **)>::operator()+0x19 [/builds/worker/workspace/obj-build/dist/include/nsWindowsDllInterceptor.h @ 254]
0a 0000001f`e6ffe360 00007ff9`39bbbb92 firefox!mozilla::freestanding::patched_LdrLoadDll+0x1b3 [/builds/worker/checkouts/gecko/browser/app/winlauncher/freestanding/DllBlocklist.cpp @ 360]
0b 0000001f`e6ffe440 00007ff9`3a51e067 KERNELBASE!LoadLibraryExW+0x172
0c 0000001f`e6ffe4b0 00007ff9`3c4f2c94 user32!_ClientLoadLibrary+0xa7
0d 0000001f`e6ffe650 00007ff9`39f314d4 ntdll!KiUserCallbackDispatcherContinue
0e 0000001f`e6ffe768 00007ff9`3a531bdf win32u!NtUserPeekMessage+0x14
0f 0000001f`e6ffe770 00007ff9`3a531aac user32!_PeekMessage+0x3f
10 0000001f`e6ffe7e0 00007ff8`63d37d4c user32!PeekMessageW+0x9c
...
# Child-SP RetAddr Call Site
00 0000001f`e6ffdb98 00007ff6`2bdb0885 0x00000216`7a8c2cb4 (orig_QueryAttributes)
01 0000001f`e6ffdba0 00007ff9`3c47ac39 firefox!TargetNtQueryAttributesFile+0x35 [/builds/worker/checkouts/gecko/security/sandbox/chromium/sandbox/win/src/filesystem_interception.cc @ 204]
02 0000001f`e6ffdce0 00007ff9`3c47a3f7 ntdll!LdrpGetNtPathFromDosPath+0x95
03 0000001f`e6ffddc0 00007ff9`3c461072 ntdll!LdrpResolveDllName+0x103
04 0000001f`e6ffde30 00007ff9`3c460eaf ntdll!LdrpMapDllFullPath+0x72
05 0000001f`e6ffdfc0 00007ff9`3c4788c4 ntdll!LdrpProcessWork+0x77
06 0000001f`e6ffe010 00007ff9`3c468c9c ntdll!LdrpLoadDllInternal+0x1a0
07 0000001f`e6ffe0b0 00007ff9`3c47a24a ntdll!LdrpLoadDll+0xb0
08 0000001f`e6ffe270 00007ff6`2bd98803 ntdll!LdrLoadDll+0xfa
09 (Inline Function) --------`-------- firefox!mozilla::interceptor::FuncHookCrossProcess<mozilla::interceptor::WindowsDllInterceptor<mozilla::interceptor::VMSharingPolicyUnique<mozilla::interceptor::MMPolicyOutOfProcess> >,long (*)(wchar_t *, unsigned long *, _UNICODE_STRING *, void **)>::operator()+0x19 [/builds/worker/workspace/obj-build/dist/include/nsWindowsDllInterceptor.h @ 254]
0a 0000001f`e6ffe360 00007ff9`39bbbb92 firefox!mozilla::freestanding::patched_LdrLoadDll+0x1b3 [/builds/worker/checkouts/gecko/browser/app/winlauncher/freestanding/DllBlocklist.cpp @ 360]
0b 0000001f`e6ffe440 00007ff9`3a51e067 KERNELBASE!LoadLibraryExW+0x172
0c 0000001f`e6ffe4b0 00007ff9`3c4f2c94 user32!_ClientLoadLibrary+0xa7
0d 0000001f`e6ffe650 00007ff9`39f314d4 ntdll!KiUserCallbackDispatcherContinue
0e 0000001f`e6ffe768 00007ff9`3a531bdf win32u!NtUserPeekMessage+0x14
0f 0000001f`e6ffe770 00007ff9`3a531aac user32!_PeekMessage+0x3f
10 0000001f`e6ffe7e0 00007ff8`63d37d4c user32!PeekMessageW+0x9c
...
Now, why do we not have full stacks in the profile for orig_OpenFile
, orig_QueryAttributes
, NtClose
, NtCreateSection
, ZwMapViewOfSection
, ZwQueryVirtualMemory
, ZwUnmapViewOfSection
? Because they all happen in LdrLoadDll
, and before calling into LdrLoadDll
we disable stack walking over concerns about the function table locks, and we restore it only after the call. So even though these calls seem to appear out of nowhere in the profile, they should be understood as having the same call stack as what I just shared above.
When we understand that, it confirms that the process spends ~100% of its time in nsThread::ProcessNextEvent
. In fact, we are facing an infinite cycle of events, where the code that we execute to process an event will itself generate a new DLL load event that needs to be processed, and again, leaving the process no time to rest. Here is how the cycle appears:
- a first legitimate native event arrives;
- we reach
nsThread::ProcessNextEvent
to look for an event; - that calls into
nsBaseAppShell::OnProcessNextEvent
, which calls intoPeekMessageW
; - there is an event, and
pmls64.dll
is not loaded yet, so we reach_ClientLoadLibrary
to loadpmls64.dll
where the registered hook lives; - that calls
patched_LdrLoadDll
, which will fail to load the DLL because it is blocked, then will call intoDllServices::DispatchDllLoadNotification
to register a new DLL load event for the main thread, without actually posting it for the moment; - since loading the DLL failed, we handle the original event without applying the hook, then we return from
PeekMessageW
; - the next call to
nsThread::ProcessNextEvent
will reachEnsureMainThreadTasksScheduled
, where the DLL load event will be posted withPostMessage
as a new legitimate native event, right before going intonsBaseAppShell::OnProcessNextEvent
, thus starting a new iteration of the cycle...
Note that we only have a cycle because of blocking. If we do not block the DLL, we avoid the cycle:
- if the DLL loads successfully, further calls into
PeekMessageW
will not call into_ClientLoadLibrary
for the same library anymore, they will call the hook directly; - if the DLL fails to load for another reason than blocking (e.g.
DllMain
returnsFALSE
onDLL_PROCESS_ATTACH
), we do not generate any DLL load event.
The cycle described above exists even when the profiler is not attached.
Assignee | ||
Comment 5•2 years ago
•
|
||
(In reply to Yannis Juglaret [:yannis] from comment #4)
Note that we only have a cycle because of blocking. If we do not block the DLL, we avoid the cycle:
- if the DLL loads successfully, further calls into
PeekMessageW
will not call into_ClientLoadLibrary
for the same library anymore, they will call the hook directly;- if the DLL fails to load for another reason than blocking (e.g.
DllMain
returnsFALSE
onDLL_PROCESS_ATTACH
), we do not generate any DLL load event.
[:gcp] noticed that by blaming that code, we can see that the possibility to register events about failure to load a blocked DLL was added in bug 1686229. It allows us to log information in third-party module pings about which DLLs have been actively blocked. We think we should be able to break the cycle by sending at most one event per blocked DLL.
Assignee | ||
Comment 6•2 years ago
•
|
||
I created a Github repository to share the reproducer.
Steps to reproduce:
- Install the latest Microsoft Visual C++ Redistributable package for x64.
- Download
HookTests.exe
andHookLib.dll
from the release section of the repository. Keep them in the same folder. - (Recommended) Check integrity of the files from a Powershell window. The hashes on your computer should match:
PS > Get-FileHash .\HookTests.exe
Algorithm Hash Path
--------- ---- ----
SHA256 5EBEDD581258A1C29F4883B8B443C2B4EF86F442C0D53954AA6E4A8409DAB14D .\HookTests.exe
PS > Get-FileHash .\HookLib.dll
Algorithm Hash Path
--------- ---- ----
SHA256 CC4509C7D621B0F69584FF7FB082C9DC63C178133D03ED362C713DAF065D7C56 .\HookLib.dll
- Double click on
HookTests.exe
. SmartScreen prevents you from running the executable. ClickMore info
,Run anyway
. HookTests.exe
launches and should show:SUCCESS: The hook is installed
.- Open Firefox and navigate to
about:third-party
. You should seeHookLib.dll
listed. - In Firefox, click
Reload with system info
. A new buttonBlock this module
appears nearHookLib.dll
. Click it. ClickRestart Firefox now
. - When Firefox restarts, it should be extremely slow. If you can, navigate to
about:processes
, it should show that the main process and the GPU process together use around 100% of CPU. - Back in
HookTests.exe
, press ENTER, it should show:SUCCESS: The hook is uninstalled
. Press ENTER again. - Back in Firefox, you should now experience normal performance.
- Navigate to
about:third-party
. ClickReload with system info
. Click theCurrently blocked
button nearHookLib.dll
to unblock the module. ClickRestart Firefox now
.
Expected result: Once this bug is fixed, performing these steps should not result in bad performance at any point of Firefox use. The main process and the GPU process should not use 100% CPU together but have normal CPU usage.
Assignee | ||
Updated•2 years ago
|
Assignee | ||
Comment 7•2 years ago
|
||
I was able to reproduce the issue following the STR from comment 6 in a local build. Applying the following patch removes the issue, suggesting that the analysis from comment 4 is correct:
diff --git a/toolkit/xre/dllservices/mozglue/LoaderObserver.cpp b/toolkit/xre/dllservices/mozglue/LoaderObserver.cpp
--- a/toolkit/xre/dllservices/mozglue/LoaderObserver.cpp
+++ b/toolkit/xre/dllservices/mozglue/LoaderObserver.cpp
@@ -58,9 +58,7 @@ void LoaderObserver::OnEndDllLoad(void*
utf8RequestedDllName.get()));
}
- // We want to record a denied DLL load regardless of |aNtStatus| because
- // |aNtStatus| is set to access-denied when DLL load was blocked.
- if ((!NT_SUCCESS(aNtStatus) && !aModuleLoadInfo.WasDenied()) ||
+ if (!NT_SUCCESS(aNtStatus) ||
!aModuleLoadInfo.WasMapped()) {
return;
}
Of course this patch would prevent logging blocked modules in third-party module pings, so this is not the patch we want to apply to solve this issue.
Assignee | ||
Comment 8•2 years ago
|
||
Updated•2 years ago
|
Assignee | ||
Comment 9•2 years ago
•
|
||
Here are profiles captured from a local build while the hook is active, following the STR from comment 6, before and after the proposed patch:
- before patch: GPU and parent processes are flooded with DLLLoad events for
HookLib.dll
and consuming a lot of CPU, GPU process main thread is active 100% of the time, parent process main thread is active 100% of the time too except when it is waiting for the GPU process to do something; - after patch: the DLLLoad events for
HookLib.dll
correlate with DOMEvent (parent process) and ReceiveMouseInputEvent (GPU), CPU usage is back to normal.
Comment 11•2 years ago
|
||
Comment 12•2 years ago
|
||
bugherder |
Description
•